Abstract

We adapt arguments concerning entropy-theoretic convergence from 
the independent case to the case of FKG random variables. FKG sys- 
tems are chosen since their dependence structure is controlled through 
covariance alone, though in the sequel we use many of the same argu- 
ments for weakly dependent random variables. As in previous work 
of Barron and Johnson, we consider random variables perturbed by 
small normals, since the FKG property gives us control of the result- 
ing densities. We need to impose a finite susceptibility condition - 
that is, the covariance between one random variable and the sum of 
all the random variables should remain finite. 

1 Introduction and notation 

Gnedenko and Korolev [?] discuss the relationship between probabilistic limit
theorems and the increase of entropy, saying that

The formal coincidence of definitions of entropies in physics and 
in information theory gives rise to the question, whether analogs 
of the second law of thermodynamics exist in probability theory. 

It is indeed striking that whilst the principle of increase of physical entropy is 
taken for granted, the increase of information theoretic entropy is much less 
well understood. The Gaussian is both the distribution of maximum entropy
(under a variance constraint) and the limit distribution of convolutions in 
the Central Limit regime (which preserves variance). There is clear physical 
interest in asking questions such as whether entropy always increases on 
convolution, and whether it tends to this maximum. 

By showing that the entropy tends to its maximum, we prove the Central 
Limit Theorem in a stronger sense than classical methods achieve. This form 
of convergence implies the classical weak convergence proved in Theorem 2 of
Newman [13]. Lemma 5 of Takano [16] and Theorem 3.1 of Carlen and Soffer [?]
also prove only weak convergence, though under different conditions.
Furthermore by understanding how score functions become more linear on 
convolution, we gain an insight into the workings of the limit theorem, and 
why the Gaussian is the limiting distribution. We are able to gain some in- 
sight into the relationship between maximum entropy distributions and limit 
theorems in this way, and see the Central Limit Theorem in a new light. 

Gnedenko and Korolev propose a programme to investigate the relationship 
between results like the Central Limit Theorem and maximum entropy
distributions. This programme has been developed by Brown [?], Barron [1],
Johnson [?] and Barron and Johnson [7], who have used information-theoretic
techniques to prove the Central Limit Theorem. These papers only deal with
the case of independent random variables, either identically distributed or
non-identical and satisfying a Lindeberg-like condition.

This paper extends these results and develops new techniques to consider the 
case of dependent random variables satisfying the FKG inequalities. The fact 
that proofs of entropy-theoretic convergence have only previously existed for 
independent variables is unfortunate, particularly given the natural physical 
interest in dependent systems. In extending Barron's work, we have shown 
the link between physical and information-theoretic entropies holds in more 
generality than Gnedenko and Korolev discussed. 

Definition 1.1 A set of random variables $\{X_1, X_2, \ldots, X_m\}$ is said to be
FKG if $\mathrm{Cov}(F(X_1, X_2, \ldots, X_m), G(X_1, X_2, \ldots, X_m)) \geq 0$ for all increasing
functions $F$, $G$.

FKG (Fortuin-Kasteleyn-Ginibre) inequalities hold for many physical models
with 'positive correlation', as discussed by Newman [13]. For example, in
the Ising model with Hamiltonian $H = \sum_{j,k} J(j-k) X_j X_k - h \sum_j X_j$, the
FKG inequalities hold if $J(r) \geq 0$ for all $r$. Further, FKG inequalities hold
for percolation models, where $X_j = I(\text{vertex } j \text{ is in an infinite cluster})$, and
for Yukawa models of Quantum Field Theory.
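As a small illustration of Definition 1.1, the following sketch checks the FKG property by brute force for a two-spin system. The Gibbs weight $\exp(J x_1 x_2 + h(x_1 + x_2))$, the function names and the parameter values are assumptions made for this example only. Since covariance is bilinear and every increasing function on a finite poset is a constant plus a non-negative combination of indicators of up-sets, it is enough to check indicators of up-sets.

```python
from itertools import product
import math

STATES = list(product([-1, 1], repeat=2))

def two_spin_measure(coupling, field):
    # Assumed Gibbs weight exp(J*x1*x2 + h*(x1 + x2)); illustrative only.
    weights = [math.exp(coupling * x1 * x2 + field * (x1 + x2)) for x1, x2 in STATES]
    total = sum(weights)
    return {state: w / total for state, w in zip(STATES, weights)}

def is_up_set(subset):
    # An up-set is closed under increasing any coordinate.
    for x1, x2 in subset:
        for y1, y2 in STATES:
            if y1 >= x1 and y2 >= x2 and (y1, y2) not in subset:
                return False
    return True

def min_upset_covariance(measure):
    # Smallest Cov(1_A, 1_B) over all pairs of up-sets A, B.
    up_sets = []
    for mask in range(1 << len(STATES)):
        subset = {s for i, s in enumerate(STATES) if (mask >> i) & 1}
        if is_up_set(subset):
            up_sets.append(subset)

    def prob(event):
        return sum(measure[s] for s in event)

    return min(prob(a & b) - prob(a) * prob(b) for a in up_sets for b in up_sets)

ferromagnetic = min_upset_covariance(two_spin_measure(coupling=0.7, field=0.2))
antiferromagnetic = min_upset_covariance(two_spin_measure(coupling=-0.7, field=0.2))
```

Under these assumptions the minimum covariance is zero (up to rounding) for the non-negative coupling and strictly negative for the negative coupling, where the events $\{x_1 = 1\}$ and $\{x_2 = 1\}$ are negatively correlated.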

To obtain convergence in relative entropy we use de Bruijn's identity, which 
relates the relative entropy to Fisher information of perturbed random vari- 
ables, which have densities we can control. 

Definition 1.2 For a random variable $U$ with variance $\sigma^2$ and smooth density
$f$, we consider the score function $\rho(u) = f'(u)/f(u)$, the Fisher information
$J(U) = \mathbb{E}\rho^2(U)$, and the standardised Fisher Information $J_{st}(U) =
\sigma^2 J(U) - 1 = \mathbb{E}(\sigma\rho(U) + U/\sigma)^2 \geq 0$.
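The quantities in Definition 1.2 can be approximated numerically. The sketch below is only an illustration: the grid, the finite-difference derivative and the two test densities (a centred normal and a symmetric two-component normal mixture) are choices made here, not taken from the paper.

```python
import numpy as np

def normal_pdf(u, mean, var):
    return np.exp(-(u - mean) ** 2 / (2.0 * var)) / np.sqrt(2.0 * np.pi * var)

def standardised_fisher_information(density, grid):
    """Approximate J_st(U) = sigma^2 J(U) - 1 on a uniform grid."""
    f = density(grid)
    du = grid[1] - grid[0]
    mean = np.sum(grid * f) * du
    var = np.sum((grid - mean) ** 2 * f) * du
    df = np.gradient(f, du)             # finite-difference approximation of f'
    fisher = np.sum(df ** 2 / f) * du   # J(U) = integral of f'(u)^2 / f(u)
    return var * fisher - 1.0

grid = np.linspace(-12.0, 12.0, 24001)
j_st_normal = standardised_fisher_information(lambda u: normal_pdf(u, 0.0, 2.0), grid)
j_st_mixture = standardised_fisher_information(
    lambda u: 0.5 * normal_pdf(u, -1.0, 1.0) + 0.5 * normal_pdf(u, 1.0, 1.0), grid
)
```

The normal density gives a value close to zero, while the mixture (which has the same variance) gives a strictly positive value, in line with $J_{st}(U) \geq 0$.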



Lemma 1.3 (de Bruijn) For $U$ with mean 0, variance 1 and density $f$, the
relative entropy distance $D(f\|\phi)$ from a standard normal can be expressed in
terms of the Fisher information of $U$ perturbed by normals $Z_\tau \sim N(0,\tau)$:
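One standard way of writing de Bruijn's identity in integrated form is given here as a sketch. The normalisation is an assumption, chosen so that the weight $1/(1+\tau)$ matches the one appearing in Condition 3, with $Z_\tau$ independent of $U$:

```latex
D(f\|\phi)
  = \frac{1}{2}\int_0^\infty \left( J(U + Z_\tau) - \frac{1}{1+\tau} \right) d\tau
  = \frac{1}{2}\int_0^\infty \frac{J_{st}(U + Z_\tau)}{1+\tau}\, d\tau .
```

The second equality uses $\mathrm{Var}(U + Z_\tau) = 1 + \tau$, so that $J_{st}(U + Z_\tau) = (1+\tau)J(U + Z_\tau) - 1$.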



Lemma 3 of Newman [13] shows that for $(S,T)$ FKG, we can control
$\mathrm{Cov}(f(S), g(T))$. In our case this is useful because if $Z_S^{(\tau)}$, $Z_T^{(\tau)}$ are normal
$N(0,\tau)$, independent of $S$, $T$ and each other, this means we can control the
difference of densities $p_{X,Y} - p_X p_Y$, where $(X,Y) = (S + Z_S^{(\tau)}, T + Z_T^{(\tau)})$. See Lemma 3.5
for a discussion of these methods.

First we establish conditions under which convergence $J_{st}(U) \to 0$ holds,
which implies more conventional forms of convergence:

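Lemma 1.4 (Shimizu [14]) If $U$ has variance $\sigma^2$, density $f$ and distribution
function $F$ then denoting the density and distribution function of a
$N(0,\sigma^2)$ by $\phi$ and $\Phi$ respectively, there are constants $c_1$, $c_2$ (given explicitly in [14]) such that:

$$\sup_u |F(u) - \Phi(u)| \leq \int |f(u) - \phi(u)|\,du \leq c_1 \sqrt{J_{st}(U)},$$

$$\sup_u |f(u) - \phi(u)| \leq c_2 \sqrt{J_{st}(U)}.$$
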
Indeed weak convergence implies that $\lim_n \mathbb{E}h(S_n) = \mathbb{E}h(Z)$, for all bounded
uniformly continuous functions $h$. Convergence in relative entropy extends
this to the class of measurable functions bounded by some multiple of $x^2 + 1$
(see Barron [1] for further details).

Definition 1.5 Consider a stationary $d$-dimensional system of random variables
$X_u$ (where the index $u \in \mathbb{Z}^d$), with mean zero and finite variance. For
a particular vector $x = (x_1, x_2, \ldots, x_d)$, we define the box

$$B_x = \{u : 0 < u_i \leq x_i \text{ for all } i\},$$

with volume $|x| = |B_x| = \prod_i x_i$. We can define $v(x) = \mathrm{Var}(\sum_{u \in B_x} X_u)$
and $U_x = (\sum_{u \in B_x} X_u)/\sqrt{|x|}$. Define perturbed random variables $Y_u^{(\tau)} =
X_u + Z_u^{(\tau)}$, for $Z_u^{(\tau)}$ a sequence of $N(0,\tau)$ variables independent of $X_u$ and each other.
We introduce $V_x^{(\tau)} = (\sum_{u \in B_x} Y_u^{(\tau)})/\sqrt{|x|} \sim U_x + Z^{(\tau)}$.



Condition 1 (Finite Susceptibility)

$$\sum_u \mathrm{Cov}(X_0, X_u) < \infty.$$

Definition 1.6 For a function $\psi$, define the class of random variables $X$ with
variance $\sigma^2$ such that:

$$\mathcal{C}_\psi = \{X : \mathbb{E}X^2 I(|X| \geq R\sigma) \leq \sigma^2 \psi(R) \text{ for all } R\}.$$

Condition 2 (Uniform Integrability) There exists $\psi$ such that $V_u^{(\tau)} \in \mathcal{C}_\psi$
for all $u$.

Theorem 1.7 Consider a stationary collection of mean zero, finite variance
random variables $X_u$ obeying the FKG inequalities and finite susceptibility
(Condition 1). Then

$$\lim_{|u| \to \infty} J_{st}(V_u^{(\tau)}) = 0$$

if and only if Condition 2 (Uniform Integrability) holds.


Condition 2 (for stationary FKG variables with finite variance) is actually
implied by Condition 1. This follows by Newman's proof [13] that these
conditions imply the Central Limit Theorem, since if $F_n$ is the distribution
function of $V_n^{(\tau)}$ then $F_n(x) \to \Phi(x)$, so

$$\int z^2 I(|z| > N)\,dF_n(z) = 1 - \int z^2 I(|z| \leq N)\,dF_n(z) \to 1 - \int z^2 I(|z| \leq N)\,d\Phi(z) = \int z^2 I(|z| > N)\,d\Phi(z).$$

Carlen and Soffer claim on Page 369 that if $\mathrm{Cov}(X_0, X_i)$ decays at a rate of
$|i|^{-t}$, where $t > 2d$, $d$ the dimension of the lattice, then Condition 2 will hold.
This roughly corresponds to requiring that $\sum_i \mathrm{Cov}(X_0, X_i)^{1/2} < \infty$.

Note: we do not need to assume that the Xi themselves have densities - even 
if not, by the following Lemma we obtain weak convergence of the normalized 
sums of the original variables. 

Definition 1.8 Define $\kappa(n,\tau) = \sup_{|u| \geq n} J_{st}(V_u^{(\tau)})$.

Condition 3 For some $n$, $\int \kappa(n,\tau)/(1+\tau)\,d\tau$ is finite.

Theorem 1.9 Consider a stationary collection of mean zero, finite variance
random variables $X_u$ with densities, obeying the FKG inequalities and
Conditions 1 and 3. Then if $g_u$ is the density of $V_u^{(0)}$, then:

$$D(g_u \| \phi) \to 0,$$

if and only if Condition 2 (Uniform Integrability) holds.

Proof Via monotone convergence: $\kappa(n,\tau)$ converges monotonically to 0 in
$n$, and hence $\int \kappa(n,\tau)/(1+\tau)\,d\tau$ converges to zero. □

Newman claims that if instead of scaling by $|x|$, we scale by $v(x)$, Condition
1 can be relaxed to Condition 2 and Condition 4.

Condition 4 If $K(R) = \sum_{|j| \leq R} \mathrm{Cov}(X_0, X_j)$, then $K(R)$ is slowly varying
(that is, for any $\lambda$, $\lim_{R \to \infty} K(\lambda R)/K(R) = 1$).

He remarks that Condition 2 can be checked if for example $\mathbb{E}(V_u^{(\tau)})^4 \leq
3(\mathbb{E}(V_u^{(\tau)})^2)^2$, which itself holds in many cases as a consequence of results
such as the Lebowitz inequality [?] or the GHS inequality [12].
Takano [15], [16] deals with the behaviour of entropy under a $\delta_4$-mixing
condition, which seems hard to check in most useful cases, since it is defined by
a ratio of densities. Further, Takano only proves convergence in relative
entropy of the 'rooms' (in Bernstein's terminology), equivalent to weak
convergence of the original variables. Our conclusion holds in the stronger sense
of relative entropy convergence of the full sequence. Another paper to use
entropy-theoretic methods in the dependent case is by Carlen and Soffer [?].
They use a variety of conditions which are different to ours, but again only
prove weak convergence for dependent variables.



2 Fisher Information and convolution 



In the independent case, Fisher information is a sub-additive quantity on
convolution. In the dependent case, we prove that Fisher information is 'almost
sub-additive' - the interest comes in trying to bound the error term. Takano
[15], [16] produces bounds which depend on his $\delta_4$ mixing coefficient, which is
hard to understand, and hard to check since it depends on ratios of densities.
Our calculations provide weaker, and more standard conditions under which
the CLT will hold in the sense of convergence of Fisher Information.



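Definition 2.1 For random variables $X$, $Y$ with score functions $\rho_X$, $\rho_Y$, for
any $\beta$, we define $\tilde\rho$ for the score function of $\sqrt{\beta}X + \sqrt{1-\beta}Y$ and then:

$$\Delta(X,Y,\beta) = \mathbb{E}\left(\sqrt{\beta}\rho_X(X) + \sqrt{1-\beta}\rho_Y(Y) - \tilde\rho\left(\sqrt{\beta}X + \sqrt{1-\beta}Y\right)\right)^2 \geq 0.$$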



The principal theorem of this section is: 



Theorem 2.2 Let $S$ and $T$ be FKG random variables, with mean zero and
variance $\leq K$. Consider $Z_S^{(\tau)}$ and $Z_T^{(\tau)}$, distributed as $N(0,\tau)$, independent
of $S$, $T$ and of each other. Define $X = S + Z_S^{(\tau)}$ and $Y = T + Z_T^{(\tau)}$, with score
functions $\rho_X$ and $\rho_Y$. There exists a constant $C = C(K,\tau,\epsilon)$ such that for
any $\beta$:

$$\beta J(X) + (1-\beta)J(Y) - J\left(\sqrt{\beta}X + \sqrt{1-\beta}Y\right) + C\,\mathrm{Cov}(S,T)^{1/3-\epsilon} \geq \Delta(X,Y,\beta).$$

If $S$, $T$ have bounded $(2+\delta)$th moment, we can replace $1/3$ by $(2+\delta)/(6+\delta)$.
Proof The proof of the first result requires some involved analysis, and is
deferred to Section 3. □



Next, we need lower bounds on the term $\Delta(X,Y,\beta)$. As discussed in Barron
and Johnson [7], in the case of independent variables, such terms are equal
to zero exactly when all the functions concerned are linear. In general, if
such an expression is small, then the derivatives of $\rho_X$ and $\rho_Y$ are close to
constant, so long as we have uniform control over the tails of $X$ and $Y$.

Proposition 2.3 For any $\psi$, there exists a function $\nu = \nu_\psi$, with $\nu(\epsilon) \to 0$
as $\epsilon \to 0$, such that if $X$, $Y$ lie in $\mathcal{C}_\psi$, then

$$\beta(1-\beta) J_{st}(X) \leq \nu(\Delta(X,Y,\beta)).$$

Proof Define a semi-norm $\|\cdot\|_\Theta$ on functions via:

$$\|f\|_\Theta^2 = \inf_{a,b} \mathbb{E}\left(f(Z_{\tau/2}) - aZ_{\tau/2} - b\right)^2,$$

where $Z_{\tau/2}$ is $N(0,\tau/2)$.

Using Lemma 3.1 of Johnson [?], for $K > 0$, there exists a constant $\xi_K > 0$
such that for any dependent random variables $(S,T)$ with variances $\leq K$ then
the sum $(X,Y) = (S + Z_S^{(\tau)}, T + Z_T^{(\tau)})$ has joint density $p^{(\tau)}(x,y)$ bounded
below by $\xi_K \phi_{\tau/2}(x)\phi_{\tau/2}(y)$.

Hence writing $h(x,y) = \sqrt{\beta}\rho_X(x) + \sqrt{1-\beta}\rho_Y(y) - \tilde\rho(\sqrt{\beta}x + \sqrt{1-\beta}y)$, then:

$$\Delta(X,Y,\beta) = \int p^{(\tau)}(x,y) h(x,y)^2\,dx\,dy \geq \xi_K \int \phi_{\tau/2}(x)\phi_{\tau/2}(y) h(x,y)^2\,dx\,dy \geq \frac{\beta(1-\beta)\xi_K}{2}\left(\|\rho_X\|_\Theta^2 + \|\rho_Y\|_\Theta^2\right),$$

by Proposition 3.2 of [?]. The crucial result of Johnson implies for a fixed
$\psi$, if the sequence $X_n \in \mathcal{C}_\psi$ have score functions $f_n$, then $\|f_n\|_\Theta \to 0$ implies
that $J_{st}(X_n) \to 0$. □

Define

$$J(n) = \sup\{J_{st}(V_x^{(\tau)}) : |x| = n\}.$$

Note that in the 1-dimensional case, there is only one set of this form,
$\{1, 2, \ldots, n\}$.
Corollary 2.4 If Condition 2 holds for some $\psi$ then there exists $d(m) \to 0$
as $m \to \infty$ such that for $m \geq n$:

$$J(n+m) \leq \frac{m}{m+n}J(m) + \frac{n}{m+n}J(n) + d(m) - \nu^{-1}\left(\frac{J(m)\,mn}{(m+n)^2}\right).$$

Proof For any $x$, we can decompose the box into smaller distinct ones:
$B_x = B_y \cup B_z$, where $B_y \cap B_z = \emptyset$, and $x = y = z$ for all but the $j$th
coordinate, so that:

$$B_z = \{u : 0 < u_i \leq x_i \text{ for all } i \neq j \text{ and } y_j + 1 \leq u_j \leq y_j + z_j = x_j\}.$$

This corresponds to splitting the box into two smaller ones by making a
cut parallel to the $j$th face. We write $U_z = (\sum_{u \in B_z} X_u)/\sqrt{|z|}$ and
$V_z^{(\tau)} = (\sum_{u \in B_z} Y_u^{(\tau)})/\sqrt{|z|}$.



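Taking $\beta = m/(m+n) = |y|/|x|$, and by substituting in Theorem 2.2 and using
$J(m) \leq 1/\tau$, we obtain, up to the error terms supplied by Theorem 2.2 and
Proposition 2.3,

$$J_{st}(V_x^{(\tau)}) \leq \frac{m}{m+n} J_{st}(V_y^{(\tau)}) + \frac{n}{n+m} J_{st}(V_z^{(\tau)}) + \cdots$$

since

$$V_x^{(\tau)} = \sqrt{\frac{m}{m+n}}\,V_y^{(\tau)} + \sqrt{\frac{n}{m+n}}\,V_z^{(\tau)}.$$
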
Define

$$c(m,n) = \sup\{\mathrm{Cov}(U_y, U_z) : |y| = m, |z| = n\}.$$

Under the finite susceptibility condition (Condition 1), Lemma 4 of Newman [13]
shows that this quantity is bounded above in a suitable way, since
$\mathrm{Var}(U_u - U_v) \to 0$ if $|u|/|v| \to 0$. □

We are able to complete the proof of the CLT, under FKG conditions. 

Proof of Theorem 1.7 We first establish convergence along the 'powers of 2
subsequence'. Condition 2 implies that $V_u^{(\tau)} \in \mathcal{C}_\psi$ for some $\psi$ and hence, by
Proposition 2.3 with $\beta = 1/2$, that
$J_{st}(V_y^{(\tau)})/4 \leq \nu(\Delta(V_y^{(\tau)}, V_z^{(\tau)}, 1/2))$. We can write $D(k) = \nu^{-1}(J_{st}(V_{2^k}^{(\tau)})/4)$. By
Corollary 2.4, we know that:

$$J(2^{k+1}) \leq J(2^k) + d(k) - D(k),$$

where $d(k) \to 0$.

We use an argument structured like Linnik's proof [?]. Given $\epsilon$, we can find
$K$ such that $d(k) \leq \epsilon/2$, for all $k \geq K$. Now either:

1. For all $k \geq K$, $2d(k) \leq D(k)$, so $J(2^k) - J(2^{k+1}) \geq D(k)/2$, and
summing the telescoping sum, we deduce that $\sum_k D(k)$ is finite, and
hence there exists $L$ such that $D(L) \leq \epsilon$.

2. Otherwise for some $L \geq K$, $2d(L) \geq D(L)$, then $D(L) \leq \epsilon$.

Thus, in either case, there exists $L$ such that $D(L) \leq \epsilon$, and hence by
Proposition 2.3, $J(2^L) \leq 4\nu(\epsilon)$.

Now, for any $k \geq L$, either $J(2^{k+1}) \leq J(2^k)$, or $D(k) \leq d(k) \leq \epsilon$. In the
second case, $J(2^k) \leq 4\nu(\epsilon)$, so that $J(2^{k+1}) \leq 4\nu(\epsilon) + \epsilon$. In either case, we
prove by induction that for all $k \geq L$, $J(2^{k+1}) \leq 4\nu(\epsilon) + \epsilon$.

Now, we can 'fill in the gaps' to gain control of the whole sequence,
adapting the proof of the standard sub-additive inequality, using the methods
described in Appendix 2 of [5]. □
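The recursion along the powers-of-2 subsequence can be explored numerically. The sketch below treats the inequality $J(2^{k+1}) \leq J(2^k) + d(k) - D(k)$ as an equality (the worst case) and uses purely hypothetical choices $\nu(\epsilon) = \sqrt{\epsilon}$, so that $\nu^{-1}(j) = j^2$, and $d(k) = 2^{-k}$. None of these choices come from the paper; they only illustrate how a vanishing error term and a positive correction force the bound down.

```python
def iterate_fisher_bound(j0, steps, d, nu_inverse):
    """Iterate J_{k+1} = J_k + d(k) - nu_inverse(J_k / 4), the worst case of the recursion."""
    values = [j0]
    for k in range(steps):
        current = values[-1]
        values.append(current + d(k) - nu_inverse(current / 4.0))
    return values

# Hypothetical error term d(k) = 2^{-k} and nu^{-1}(j) = j^2 (assumptions for illustration).
bounds = iterate_fisher_bound(
    j0=1.0,
    steps=200,
    d=lambda k: 2.0 ** (-k),
    nu_inverse=lambda j: j * j,
)
```

With these choices the sequence first rises while $d(k)$ dominates, then decreases towards zero once $d(k)$ becomes negligible, which is the qualitative behaviour the proof establishes.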



3 Proof of sub-additive relation 

This is the key part of the argument, proving the bounds at the heart of the 
limit theorems. However, although the analysis is somewhat involved, it is 
not too technically difficult. 

We introduce notation where it will be clear whether densities or score
functions are associated with joint or marginal distributions, by their number of
arguments: $\rho_X(x)$ will be the score function of $X$, and $p'_X(x)$ the derivative
of its density. For joint densities $p_{X,Y}(x,y)$, $p^{(1)}_{X,Y}(x,y)$ will be the
derivative of the density with respect to the first argument and $\rho^{(1)}_{X,Y}(x,y) =
p^{(1)}_{X,Y}(x,y)/p_{X,Y}(x,y)$, and so on.

Note that a similar equation to the independent case tells us about the be- 
haviour of Fisher Information of sums: 

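Lemma 3.1 If $X$, $Y$ are random variables, with joint density $p(x,y)$, and
score functions $\rho^{(1)}_{X,Y}$ and $\rho^{(2)}_{X,Y}$ then $X + Y$ has score function $\tilde\rho$ given by

$$\tilde\rho(z) = \mathbb{E}\left[\rho^{(1)}_{X,Y}(X,Y) \,\middle|\, X + Y = z\right] = \mathbb{E}\left[\rho^{(2)}_{X,Y}(X,Y) \,\middle|\, X + Y = z\right].$$
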
Proof Since $X + Y$ has density $p_{X+Y}$ given by $p_{X+Y}(z) = \int p(z-y, y)\,dy$,
then:

$$p'_{X+Y}(z) = \int \frac{\partial p}{\partial x}(z-y, y)\,dy.$$

Hence dividing, we obtain that:

$$\tilde\rho(z) = \frac{p'_{X+Y}(z)}{p_{X+Y}(z)} = \int \frac{p(z-y,y)}{p_{X+Y}(z)}\,\rho^{(1)}_{X,Y}(z-y,y)\,dy,$$

as claimed. □
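Lemma 3.1 can be checked numerically in a case where everything is known in closed form. The sketch below assumes a bivariate normal pair with unit variances and correlation $r$ (an example chosen here, not in the paper), for which the joint score in the first argument is $-(x - ry)/(1-r^2)$ and the score of $X+Y \sim N(0, 2+2r)$ is $-z/(2+2r)$.

```python
import numpy as np

def gaussian_joint_density(x, y, r):
    det = 1.0 - r * r
    return np.exp(-(x * x - 2.0 * r * x * y + y * y) / (2.0 * det)) / (2.0 * np.pi * np.sqrt(det))

def joint_score_first(x, y, r):
    # Derivative of log p(x, y) with respect to x for the bivariate normal above.
    return -(x - r * y) / (1.0 - r * r)

def score_of_sum_by_conditioning(z, r, grid):
    # E[ rho^(1)(X, Y) | X + Y = z ], computed as a ratio of Riemann sums over y.
    weights = gaussian_joint_density(z - grid, grid, r)
    return np.sum(weights * joint_score_first(z - grid, grid, r)) / np.sum(weights)

r = 0.4
grid = np.linspace(-15.0, 15.0, 30001)
z_values = [-2.0, 0.5, 1.3]
conditional = [score_of_sum_by_conditioning(z, r, grid) for z in z_values]
closed_form = [-z / (2.0 + 2.0 * r) for z in z_values]
```

The two lists agree to numerical precision, as the conditional-expectation formula predicts for this example.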

For given $a$, $b$, define the function $M(x,y) = M_{a,b}(x,y)$ by:

$$M(x,y) = a\left(\rho^{(1)}_{X,Y}(x,y) - \rho_X(x)\right) + b\left(\rho^{(2)}_{X,Y}(x,y) - \rho_Y(y)\right),$$

which is zero if $X$ and $Y$ are independent. We will show that if $\mathrm{Cov}(X,Y)$
is small, then $M$ is close to zero.

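Proposition 3.2 If $X$, $Y$ are random variables, with score functions $\rho_X$, $\rho_Y$,
and if the sum $\sqrt{\beta}X + \sqrt{1-\beta}Y$ has score function $\tilde\rho$ then

$$\beta J(X) + (1-\beta)J(Y) - J\left(\sqrt{\beta}X + \sqrt{1-\beta}Y\right)$$
$$+\, 2\sqrt{\beta(1-\beta)}\,\mathbb{E}\rho_X(X)\rho_Y(Y) + 2\,\mathbb{E}M_{\sqrt{\beta},\sqrt{1-\beta}}(X,Y)\,\tilde\rho\left(\sqrt{\beta}X + \sqrt{1-\beta}Y\right)$$
$$= \mathbb{E}\left(\sqrt{\beta}\rho_X(X) + \sqrt{1-\beta}\rho_Y(Y) - \tilde\rho\left(\sqrt{\beta}X + \sqrt{1-\beta}Y\right)\right)^2.$$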

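Proof By the two-dimensional version of Stein's equation, for any function
$f(x,y)$:

$$\mathbb{E}\rho^{(1)}_{X,Y}(X,Y) f(X,Y) = -\mathbb{E}\frac{\partial f}{\partial x}(X,Y).$$

In particular, if $f(x,y) = \tilde\rho(x+y)$:

$$\mathbb{E}\rho_X(X)\tilde\rho(X+Y) = -\mathbb{E}\tilde\rho'(X+Y) - \mathbb{E}\left(\rho^{(1)}_{X,Y}(X,Y) - \rho_X(X)\right)\tilde\rho(X+Y).$$

Hence, since $-\mathbb{E}\tilde\rho'(X+Y) = J(X+Y)$, we know that for any $a$, $b$:

$$\mathbb{E}\left(a\rho_X(X) + b\rho_Y(Y)\right)\tilde\rho(X+Y) = (a+b)J(X+Y) - \mathbb{E}M_{a,b}(X,Y)\tilde\rho(X+Y).$$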
By considering $\int p(x,y)\left(a\rho_X(x) + b\rho_Y(y) - (a+b)\tilde\rho(x+y)\right)^2 dx\,dy$, dealing
with the cross term with the expression above, we deduce that:

$$a^2 J(X) + b^2 J(Y) - (a+b)^2 J(X+Y)$$
$$+\, 2ab\,\mathbb{E}\rho_X(X)\rho_Y(Y) + 2(a+b)\,\mathbb{E}M_{a,b}(X,Y)\tilde\rho(X+Y)$$
$$= \mathbb{E}\left(a\rho_X(X) + b\rho_Y(Y) - (a+b)\tilde\rho(X+Y)\right)^2 \geq 0.$$

As in the independent case, we can rescale, and consider $X' = \sqrt{\beta}X$, $Y' =
\sqrt{1-\beta}Y$, and take $a = \beta$, $b = 1-\beta$. Note that $\sqrt{\beta}\rho_{X'}(u) = \rho_X(u/\sqrt{\beta})$,
$\sqrt{1-\beta}\rho_{Y'}(v) = \rho_Y(v/\sqrt{1-\beta})$. □
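The unscaled identity in the proof above (the one with coefficients $a$, $b$) can be verified exactly in a Gaussian example, where every score is linear and all expectations reduce to quadratic forms in the covariance matrix. The sketch assumes a bivariate normal pair with unit variances and correlation $r$; this example and the function name are illustrative choices, not part of the paper.

```python
import numpy as np

def identity_sides(a, b, r):
    """Return (lhs, rhs) of the unscaled identity for a bivariate normal with correlation r."""
    sigma = np.array([[1.0, r], [r, 1.0]])

    def expect(u, v):
        # E[(u . W)(v . W)] for W = (X, Y) with covariance sigma and mean zero.
        return float(u @ sigma @ v)

    # Each score is a linear function of (X, Y), stored as a coefficient vector.
    rho_x = np.array([-1.0, 0.0])                       # rho_X(x) = -x
    rho_y = np.array([0.0, -1.0])                       # rho_Y(y) = -y
    rho_sum = np.array([-1.0, -1.0]) / (2.0 + 2.0 * r)  # score of X + Y at x + y
    joint_1 = np.array([-1.0, r]) / (1.0 - r * r)       # d/dx log p(x, y)
    joint_2 = np.array([r, -1.0]) / (1.0 - r * r)       # d/dy log p(x, y)
    m = a * (joint_1 - rho_x) + b * (joint_2 - rho_y)   # M_{a,b}

    lhs = (a * a * expect(rho_x, rho_x) + b * b * expect(rho_y, rho_y)
           - (a + b) ** 2 * expect(rho_sum, rho_sum)
           + 2.0 * a * b * expect(rho_x, rho_y)
           + 2.0 * (a + b) * expect(m, rho_sum))
    residual = a * rho_x + b * rho_y - (a + b) * rho_sum
    return lhs, expect(residual, residual)

checks = [identity_sides(a, b, r) for a, b, r in [(1.0, 1.0, 0.5), (0.3, 0.7, 0.2), (2.0, 0.5, -0.4)]]
```

For $a = b = 1$ and $r = 1/2$ both sides equal $1/3$, and the two sides agree for the other parameter choices as well.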



We will show that the two terms on the second line of Proposition 3.2 can
be controlled when $(X,Y) = (S + Z_S^{(\tau)}, T + Z_T^{(\tau)})$, by controlling $\mathrm{Cov}(S,T)$
alone. We need control of the score functions of perturbed variables. We
obtain this in two regions, firstly in Lemma 3.3 over the tail, and then in
Lemma 3.4 over the rest of the real line.

We require an extension of Lemma 3 of Barron [1] applied to single and
bivariate random variables:

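Lemma 3.3 For any random variables $S$, $T$ as before we define $(X,Y) =
(S + Z_S^{(\tau)}, T + Z_T^{(\tau)})$ and define $p^{(2\tau)}$ for the density of $(U,V) = (S + Z_S^{(2\tau)}, T +
Z_T^{(2\tau)})$. Now there exists a constant $c_{\tau,k} = \sqrt{2}\,(2k/\tau e)^{k/2}$ such that for all $x$, $y$:

$$p^{(\tau)}(x)\,|\rho_X(x)|^k \leq c_{\tau,k}\,p^{(2\tau)}(x),$$
$$p^{(\tau)}(x,y)\,|\rho^{(1)}_{X,Y}(x,y)|^k \leq c_{\tau,k}\,p^{(2\tau)}(x,y),$$
$$p^{(\tau)}(x,y)\,|\rho^{(2)}_{X,Y}(x,y)|^k \leq c_{\tau,k}\,p^{(2\tau)}(x,y),$$

and hence

$$\left(\mathbb{E}|\rho_X(X)|^k\right)^{1/k} \leq 2^{1/(2k)}\sqrt{\frac{2k}{\tau e}}.$$
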

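Proof We adapt Barron's proof, using Hölder's inequality and the bound
$(|u|/\tau)^k \phi_\tau(u) \leq c_{\tau,k}\,\phi_{2\tau}(u)$ for all $u$:

$$|p'_X(x)|^k = \left|\mathbb{E}\left[\frac{x-S}{\tau}\,\phi_\tau(x-S)\right]\right|^k$$
$$\leq \mathbb{E}\left[\left(\frac{|x-S|}{\tau}\right)^k \phi_\tau(x-S)\right]\left(\mathbb{E}\phi_\tau(x-S)\right)^{k-1}$$
$$\leq c_{\tau,k}\left(\mathbb{E}\phi_{2\tau}(x-S)\right)\left(\mathbb{E}\phi_\tau(x-S)\right)^{k-1}.$$

A similar argument gives the other bounds. □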
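The scalar bound used in this proof can be checked numerically. The sketch below evaluates the ratio of the two sides of $(|u|/\tau)^k \phi_\tau(u) \leq c_{\tau,k}\phi_{2\tau}(u)$ on a grid, with $c_{\tau,k} = \sqrt{2}(2k/\tau e)^{k/2}$ as in Lemma 3.3. The values of $\tau$ and $k$ and the helper names are arbitrary choices for illustration.

```python
import math

def normal_pdf(u, var):
    return math.exp(-u * u / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)

def lemma_constant(tau, k):
    # c_{tau,k} = sqrt(2) * (2k / (tau * e))^{k/2}
    return math.sqrt(2.0) * (2.0 * k / (tau * math.e)) ** (k / 2.0)

def bound_ratio(u, tau, k):
    # Left-hand side divided by right-hand side; the bound holds when this is at most 1.
    lhs = (abs(u) / tau) ** k * normal_pdf(u, tau)
    rhs = lemma_constant(tau, k) * normal_pdf(u, 2.0 * tau)
    return lhs / rhs

tau, k = 0.5, 3
grid = [-20.0 + 0.001 * i for i in range(40001)]
largest_ratio = max(bound_ratio(u, tau, k) for u in grid)
ratio_at_maximiser = bound_ratio(math.sqrt(2.0 * k * tau), tau, k)
```

The ratio never exceeds 1 and is attained at $u^2 = 2k\tau$, which is where the constant $c_{\tau,k}$ comes from.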

Now, the normal perturbation ensures that the density does not decrease too
fast, and so the modulus of the score function cannot grow too fast. By
considering $S$ normal, so that $\rho$ grows linearly with $u$, we know that the $B^3$
rate of growth is a sharp bound.
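To make the sharpness remark concrete, here is the calculation in the normal case, taking $S \sim N(0,K)$ as an illustrative choice so that $X = S + Z_S^{(\tau)} \sim N(0, K+\tau)$:

```latex
\rho(u) = -\frac{u}{K+\tau},
\qquad
\int_{-B\sqrt{\tau}}^{B\sqrt{\tau}} \rho(u)^2 \, du
  = \frac{1}{(K+\tau)^2} \cdot \frac{2 (B\sqrt{\tau})^3}{3}
  = \frac{2\tau^{3/2}}{3(K+\tau)^2}\, B^3 .
```

So in this example the integral grows exactly like $B^3$, and no bound of smaller order in $B$ can hold in general.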

Lemma 3.4 If $S$ is a random variable with variance $\leq K$, for $X = S + Z_S^{(\tau)}$
with score function $\rho$, for $B \geq 1$, there exists a function $f_1(\tau,K)$ such that:

$$\int_{-B\sqrt{\tau}}^{B\sqrt{\tau}} \rho(u)^2\,du \leq f_1(\tau,K)\,B^3.$$




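Proof Now: $p(u) \geq (2\exp(2K/\tau))^{-1}\phi_{\tau/2}(u)$, so that for $u \in (-B\sqrt{\tau}, B\sqrt{\tau})$,
$(B\sqrt{\tau}\,p(u))^{-1} \leq 2\sqrt{\pi}\exp(B^2 + 2K/\tau)/B \leq 2\sqrt{\pi}\exp(B^2 + 2K/\tau)$. Hence for any
$k \geq 1$, by Hölder's inequality:

$$\int_{-B\sqrt{\tau}}^{B\sqrt{\tau}} \rho(u)^2\,du \leq \left(\int_{-B\sqrt{\tau}}^{B\sqrt{\tau}} |\rho(u)|^{2k}\,du\right)^{1/k} (2B\sqrt{\tau})^{1-1/k}$$
$$\leq 2B\sqrt{\tau}\left(\int_{-B\sqrt{\tau}}^{B\sqrt{\tau}} \frac{p(u)|\rho(u)|^{2k}}{2B\sqrt{\tau}\,\inf_u p(u)}\,du\right)^{1/k}$$
$$\leq \frac{8B}{e\sqrt{\tau}}\,k\left[2\sqrt{2\pi}\exp(B^2 + 2K/\tau)\right]^{1/k}.$$

Since we have a free choice of $k \geq 1$ to minimise $k\exp(v/k)$, since here $v \geq 1$,
taking $k = v$ means that $k\exp(v/k)\exp(-1) = v$. Hence we obtain a bound
of

$$\int_{-B\sqrt{\tau}}^{B\sqrt{\tau}} \rho(u)^2\,du \leq \frac{8B}{\sqrt{\tau}}\left(B^2 + \frac{2K}{\tau} + \log 2\sqrt{2\pi}\right) \leq \frac{8B^3}{\sqrt{\tau}}\left(3 + \frac{2K}{\tau}\right).$$

□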



We continue by considering $L_B = \{|x| \leq B\sqrt{\tau}, |y| \leq B\sqrt{\tau}\}$.

Lemma 3.5 For random variables $S$, $T$, let $X = S + Z_S^{(\tau)}$ and $Y = T + Z_T^{(\tau)}$.
If $S$, $T$ satisfy the FKG inequalities then there exists a function $f_2(\tau,K)$ such
that for $B \geq 1$:

$$\mathbb{E}M_{a,b}(X,Y)\tilde\rho(X+Y)I((X,Y) \in L_B) \leq f_2(\tau,K)(a+b)B^4\,\mathrm{Cov}(S,T).$$
Proof Lemma 3 of Newman [13] uses the fact that FKG inequalities imply
'positive quadrant dependence', originally due to Lehmann [?]. That is,
defining $H(s,t) = \mathbb{P}(S \geq s, T \geq t) - \mathbb{P}(S \geq s)\mathbb{P}(T \geq t)$, $S$, $T$ are positive
quadrant dependent iff $H(s,t) \geq 0$ for all $s$, $t$, which is a consequence of $S$, $T$
being FKG. Since $\mathrm{Cov}(S,T) = \int H(s,t)\,ds\,dt \geq 0$, then

$$\mathrm{Cov}(f(S), g(T)) = \int f'(s)g'(t)H(s,t)\,ds\,dt \leq \|f'\|_\infty\|g'\|_\infty\,\mathrm{Cov}(S,T).$$
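The covariance bound above can be illustrated by exact enumeration on a small example. The pair used here, $S = A + C$ and $T = B + C$ with $A$, $B$, $C$ independent fair bits, is an assumption made for illustration: increasing functions of independent variables are positively associated, so this pair satisfies the inequality of Definition 1.1. The test functions are $f = g = \tanh$, whose derivative is bounded by 1.

```python
from itertools import product
import math

def exact_covariance(f, g):
    """Cov(f(S), g(T)) for S = A + C, T = B + C with A, B, C independent fair bits."""
    outcomes = [(a + c, b + c) for a, b, c in product([0, 1], repeat=3)]
    weight = 1.0 / len(outcomes)
    mean_f = sum(f(s) for s, _ in outcomes) * weight
    mean_g = sum(g(t) for _, t in outcomes) * weight
    return sum((f(s) - mean_f) * (g(t) - mean_g) for s, t in outcomes) * weight

cov_st = exact_covariance(lambda s: s, lambda t: t)   # equals Var(C) = 1/4
cov_fg = exact_covariance(math.tanh, math.tanh)       # sup |tanh'| = 1
```

Here `cov_fg` lies between 0 and `cov_st`, as the bound $\|f'\|_\infty\|g'\|_\infty\,\mathrm{Cov}(S,T)$ requires when both derivative norms equal 1.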

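Since $|\phi_c(u)'| \leq \exp(-1/2)/\sqrt{2\pi c^2}$, and $|(u\phi_c(u)/c)'| \leq (2\exp(-3/2))/\sqrt{2\pi c^3}$,
we deduce that:

$$|p_{X,Y}(x,y) - p_X(x)p_Y(y)| \leq \frac{\mathrm{Cov}(S,T)}{2\pi e\tau^2},$$
$$|p^{(1)}_{X,Y}(x,y) - p'_X(x)p_Y(y)| \leq \frac{\mathrm{Cov}(S,T)}{\pi\tau^{5/2}e^2},$$
$$|p^{(2)}_{X,Y}(x,y) - p_X(x)p'_Y(y)| \leq \frac{\mathrm{Cov}(S,T)}{\pi\tau^{5/2}e^2}.$$
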

We can rearrange $M_{a,b}$ to give

$$M_{a,b}(x,y) = a\left(\frac{p^{(1)}_{X,Y}(x,y) - p'_X(x)p_Y(y)}{p_{X,Y}(x,y)}\right) + b\left(\frac{p^{(2)}_{X,Y}(x,y) - p_X(x)p'_Y(y)}{p_{X,Y}(x,y)}\right)$$
$$+ \left(a\rho_X(x) + b\rho_Y(y)\right)\left(\frac{p_X(x)p_Y(y) - p_{X,Y}(x,y)}{p_{X,Y}(x,y)}\right)$$

and hence writing $c$ for $\mathrm{Cov}(S,T)/(2\pi\tau^{5/2}e^2)$, for $(x,y) \in L_B$:

$$p_{X,Y}(x,y)|M_{a,b}(x,y)| \leq c\left(\sqrt{\tau}e\left(a|\rho_X(x)| + b|\rho_Y(y)|\right) + 2(a+b)\right).$$

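By Cauchy-Schwarz:

$$\left|\int p_{X,Y}(x,y)M_{a,b}(x,y)\tilde\rho(x+y)I((x,y) \in L_B)\,dx\,dy\right|$$
$$\leq c\int\left(\sqrt{\tau}e\left(a|\rho_X(x)| + b|\rho_Y(y)|\right) + 2(a+b)\right)|\tilde\rho(x+y)|I((x,y) \in L_B)\,dx\,dy$$
$$\leq c(a+b)\left(\sqrt{\tau}e\sqrt{2B^4\sqrt{\tau}f_1}\sqrt{16B^4\sqrt{\tau}f_1} + 2\sqrt{4B^2\tau}\sqrt{16B^4\sqrt{\tau}f_1}\right).$$

This follows firstly since:

$$\int\rho_X(x)^2 I((x,y) \in L_B)\,dx\,dy \leq (2B\sqrt{\tau})\int_{-B\sqrt{\tau}}^{B\sqrt{\tau}}\rho_X(x)^2\,dx \leq (2B\sqrt{\tau})B^3 f_1(\tau,K)$$

and

$$\int\tilde\rho(x+y)^2 I((x,y) \in L_B)\,dx\,dy$$
$$\leq \int\tilde\rho(x+y)^2 I(|x+y| \leq 2B\sqrt{\tau})I(|y| \leq B\sqrt{\tau})\,dx\,dy$$
$$\leq 2B\sqrt{\tau}\int_{-2B\sqrt{\tau}}^{2B\sqrt{\tau}}\tilde\rho(z)^2\,dz \leq 16B^4\sqrt{\tau}f_1(\tau,K).$$

□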



Lemma 3.6 For any random variables $S$, $T$ with mean zero and variance
$\leq K$, let $X = S + Z_S^{(\tau)}$ and $Y = T + Z_T^{(\tau)}$. There exists a function $f_3(\tau,K,\epsilon)$
such that:

$$\mathbb{E}M_{a,b}(X,Y)\tilde\rho(X+Y)I((X,Y) \notin L_B) \leq (a+b)\frac{f_3(\tau,K,\epsilon)}{B^{2-\epsilon}}.$$

For $S$, $T$ with $k$th moment ($k \geq 2$) bounded above, we can achieve a rate of
decay of $1/B^{k-\epsilon}$.

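Proof By Chebyshev $\mathbb{P}\left((S + Z_S^{(2\tau)}, T + Z_T^{(2\tau)}) \notin L_B\right) \leq \int p^{(2\tau)}(x,y)(x^2 +
y^2)/(2B^2\tau)\,dx\,dy \leq (K+2\tau)/(B^2\tau)$ so by Hölder-Minkowski for $1/p + 1/q = 1$:

$$\mathbb{E}\rho^{(1)}_{X,Y}(X,Y)\tilde\rho(X+Y)I((X,Y) \notin L_B)$$
$$\leq \left(\mathbb{E}|\rho^{(1)}_{X,Y}(X,Y)|^p I((X,Y) \notin L_B)\right)^{1/p}\left(\mathbb{E}|\tilde\rho(X+Y)|^q\right)^{1/q}$$
$$\leq c_{\tau,p}^{1/p}\,\mathbb{P}\left((S + Z_S^{(2\tau)}, T + Z_T^{(2\tau)}) \notin L_B\right)^{1/p}\left(\mathbb{E}|\tilde\rho(X+Y)|^q\right)^{1/q}$$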
^ 2 v / 2exp(-l) /o . . ^_ 1 

By choosing p arbitrarily close to 1, we can obtain a constant term, as re- 
quired. The other terms work in a similar way. □ 

Similarly we bound the remaining product term: 
Lemma 3.7 For FKG random variables $S$, $T$ with mean zero and variance
$\leq K$, let $X = S + Z_S^{(\tau)}$ and $Y = T + Z_T^{(\tau)}$. There exist functions $f_4(\tau,K)$
and $f_5(\tau,K)$ such that

$$\mathbb{E}\rho_X(X)\rho_Y(Y) \leq f_4(\tau,K)B^4\,\mathrm{Cov}(S,T) + f_5(\tau,K)/B^2.$$



Proof Using part of Lemma 3.5 we know that $p_{X,Y}(x,y) - p_X(x)p_Y(y) \leq
\mathrm{Cov}(S,T)/(2\pi e\tau^2)$. Hence by arguments similar to those of Lemma 3.6 and
the preceding lemmas we obtain that:

$$\mathbb{E}\rho_X(X)\rho_Y(Y) = \int\left(p_{X,Y}(x,y) - p_X(x)p_Y(y)\right)\rho_X(x)\rho_Y(y)\,dx\,dy$$
$$\leq \frac{\mathrm{Cov}(S,T)}{2\pi e\tau^2}\int|\rho_X(x)||\rho_Y(y)|I((x,y) \in L_B)\,dx\,dy$$
$$+ \int p_{X,Y}(x,y)|\rho_X(x)||\rho_Y(y)|I((x,y) \notin L_B)\,dx\,dy$$
$$+ \int p_X(x)p_Y(y)|\rho_X(x)||\rho_Y(y)|I((x,y) \notin L_B)\,dx\,dy$$



Cov(S, T) 

-2 ( / p x ,Y{x,y)\px(x)\ 2 I((x,y) £ L B )dxdy) 



as required. □ 



Proof of Theorem 2.2 Combining Lemmas 3.5, 3.6 and 3.7, we obtain for
given $K$, $\tau$, $\epsilon$ that there exist constants $C_1$, $C_2$ such that

$$\mathbb{E}M_{\sqrt{\beta},\sqrt{1-\beta}}\,\tilde\rho + \sqrt{\beta(1-\beta)}\,\mathbb{E}\rho_X\rho_Y \leq C_1\mathrm{Cov}(S,T)B^4 + C_2/B^{2-\epsilon},$$

so choosing $B = (K/\mathrm{Cov}(S,T))^{1/6} \geq 1$, we obtain a bound of $C\,\mathrm{Cov}(S,T)^{1/3-\epsilon}$.

By Lemma 3.6, note that if $X$, $Y$ have bounded $k$th moment, then we obtain
decay at the rate $C_1\mathrm{Cov}(S,T)B^4 + C_2/B^{k'}$, for any $k' < k$. Choosing $B =
\mathrm{Cov}(S,T)^{-1/(k'+4)}$, we obtain a rate of $\mathrm{Cov}(S,T)^{k'/(k'+4)}$. Hence for $k \to \infty$
we can find a rate arbitrarily close to $\mathrm{Cov}(S,T)$. □
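The choice of $B$ in this proof balances the two terms $\mathrm{Cov}(S,T)B^4$ and $B^{-k'}$. The sketch below does the exponent bookkeeping with exact fractions; the function name is introduced here for illustration, and the constants $C_1$, $C_2$ are ignored since they do not affect the exponent.

```python
from fractions import Fraction

def balanced_exponent(k_prime):
    """Exponent of Cov(S,T) in the bound when B = Cov(S,T)^(-1/(k'+4))."""
    k_prime = Fraction(k_prime)
    b_exponent = Fraction(-1) / (k_prime + 4)   # B = Cov^{b_exponent}
    first_term = 1 + 4 * b_exponent             # exponent of Cov in Cov * B^4
    second_term = -k_prime * b_exponent         # exponent of Cov in B^{-k'}
    assert first_term == second_term            # the two terms are balanced
    return first_term

rate_second_moment = balanced_exponent(2)       # the 1/3 rate of Theorem 2.2
rate_third_moment = balanced_exponent(3)        # (2 + delta)/(6 + delta) with delta = 1
rate_many_moments = balanced_exponent(1000)     # approaches 1 as k' grows
```

With $k' = 2$ this recovers the exponent $1/3$, with $k' = 2 + \delta$ it gives $(2+\delta)/(6+\delta)$ as in Theorem 2.2, and the exponent tends to 1 as $k'$ grows.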